Estimation of the parameters of a multivariate Gaussian Mixture Model is usually based on a criterion (e.g. Maximum Likelihood) that focuses mostly on the training data. Therefore, testing data, which were not seen during the training procedure, may cause problems. Moreover, numerical instabilities can occur (e.g. for low-occupied Gaussians, especially when working with full-covariance matrices in high-dimensional spaces). Another question concerns the number of Gaussians to be trained for a specific data set. The approach proposed in this paper can handle all these issues. It is based on the assumption that the training and testing data were generated from the same source distribution. The key part of the approach is to use a criterion based on the source distribution rather than on the training data itself. It is shown how to modify an estimation procedure in order to fit the source distribution better (despite the fact that it is unknown), and subsequently a new estimation algorithm for diagonal- as well as full-covariance matrices is derived and tested.